Analysis of CFA-BF: Novel combined fixed/adaptive beamforming for robust speech recognition in real car environments

نویسندگان

  • John H. L. Hansen
  • Xianxian Zhang
چکیده

Among a number of studies which have investigated various speech enhancement and processing schemes for in-vehicle speech systems, the delay-and-sum beamforming (DASB) and adaptive beamforming are two typical methods that both have their advantages and disadvantages. In this paper, we propose a novel combined fixed/adaptive beamforming solution (CFA-BF) based on previous work for speech enhancement and recognition in real moving car environments, which seeks to take advantage of both methods. The working scheme of CFA-BF consists of two steps: source location calibration and target signal enhancement. The first step is to pre-record the transfer functions between the speaker and microphone array from different potential source positions using adaptive beamforming under quiet environments; and the second step is to use this pre-recorded information to enhance the desired speech when the car is running on the road. An evaluation using extensive actual car speech data from the CU-Move Corpus shows that the method can decrease WER for speech recognition by up to 30% over a single channel scenario and improve speech quality via the SEGSNR measure by up to 1 dB on the average. 2009 Elsevier B.V. All rights reserved.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Adaptive beamforming and soft missing data decoding for robust speech recognition in reverberant environments

This paper presents a novel approach to combine microphone array processing and robust speech recognition for reverberant multi-speaker environments. Spatial cues are extracted from a microphone array and automatically clustered to estimate localization masks in the time-frequency domain. The localization masks are then used to blindly design adaptive filters in order to enhance the source sign...

متن کامل

Optimizing regression for in-car speech recognition using multiple distributed microphones

In this paper, we address issues in improving handsfree speech recognition performance in different car environments using multiple spatially distributed microphones. In previous work, we proposed multiple regression of the log-spectra (MRLS) for estimating the logspectra of speech at a close-talking microphone. In this paper, the idea is extended to nonlinear regressions. Isolated word recogni...

متن کامل

A spatio-temporal speech enhancement scheme for robust speech recognition in noisy environments

A new speech enhancement scheme is presented integrating spatial and temporal signal processing methods for robust speech recognition in noisy environments. The scheme first separates spatially localized point sources from noisy speech signals recorded by two microphones. Blind source separation algorithms assuming no a priori knowledge about the sources involved are applied in this spatial pro...

متن کامل

Spatio-temporal Speech Enhancement for Robust Speech Recognition

A new speech enhancement scheme is presented integrating spatial and temporal signal processing methods for robust speech recognition in noisy environments. The scheme first separates spatially localized point sources from noisy speech signals recorded by two microphones. Blind source separation algorithms assuming no a priori knowledge about the sources involved are applied in this spatial pro...

متن کامل

A spatio-temporal speech enhancement scheme for robust speech recognition

A new speech enhancement scheme is presented integrating spatial and temporal signal processing methods for robust speech recognition in noisy environments. The scheme first separates spatially localized point sources from noisy speech signals recorded by two microphones. Blind source separation algorithms assuming no a priori knowledge about the sources involved are applied in this spatial pro...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Speech Communication

دوره 52  شماره 

صفحات  -

تاریخ انتشار 2010